Time-Regularized Interrupting Options
Abstract
High-level skills relieve planning algorithms of low-level details. But when the skills are poorly designed for the domain, the resulting plan may be severely suboptimal. Sutton et al. (1999) took an important step towards resolving this problem by introducing a rule that automatically improves a set of skills, called options. This rule terminates an option early whenever switching to another option gives a higher value than continuing with the current option. However, they only analyzed the case where the improvement rule is applied once. We give conditions under which repeated application of this rule converges to the optimal set of options. At the core of our analysis is a new interrupting Bellman operator that simultaneously improves the set of options. One problem with the update rule is that it tends to favor lower-level skills. We therefore introduce a regularization term that favors longer-duration skills. Experimental results demonstrate that this approach can derive a good set of high-level skills even when the original set of skills cannot solve the problem.
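The interruption rule described in the abstract can be illustrated with a small sketch. This is a hedged illustration and not the authors' implementation: the table `q_values`, the function name `should_interrupt`, and the margin `rho` are all hypothetical. In particular, modelling the regularization as a fixed margin that switching must exceed is only one possible reading of "a regularization term that favors longer duration skills"; the abstract does not give the actual form of the term.

```python
import numpy as np


def should_interrupt(q_values, state, current_option, rho=0.0):
    """Decide whether to terminate the current option early in `state`.

    q_values: array of shape (n_states, n_options) holding option values
              (hypothetical input, assumed to be already estimated).
    rho:      assumed non-negative margin; rho = 0 gives the plain
              interruption rule, rho > 0 biases towards continuing.
    """
    best_value = np.max(q_values[state])             # value of the best switch
    continue_value = q_values[state, current_option]  # value of continuing
    # Interrupt only if switching beats continuing by more than the margin.
    return bool(best_value > continue_value + rho)


# Toy table: 2 states, 2 options (made-up numbers for illustration only).
q = np.array([[1.0, 1.2],
              [0.5, 0.4]])

plain = should_interrupt(q, state=0, current_option=0)                 # 1.2 > 1.0
regularized = should_interrupt(q, state=0, current_option=0, rho=0.5)  # 1.2 > 1.5
```

With `rho = 0` the option is interrupted as soon as any alternative looks better, which is how the rule can drift towards short, low-level behavior. A positive margin keeps the current option running unless the gain from switching is large enough.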